The MDS Queue: Analysing Latency Performance of Codes and Redundant Requests
Authors
Abstract
In order to scale economically, data centers are increasingly evolving their data storage methods from the use of simple data replication to the use of more powerful erasure codes, which provide the same level of reliability as replication-based methods at a significantly lower storage cost. In particular, it is well known that Maximum-Distance-Separable (MDS) codes, such as Reed-Solomon codes, provide the maximum storage efficiency. While the use of codes for providing improved reliability in archival storage systems, where the data is less frequently accessed (so-called “cold data”), is well understood, the role of codes in the storage of more frequently accessed and active “hot data”, where latency is the key metric, is less clear. In this paper, we study data storage systems based on MDS codes through the lens of queueing theory, and term this the “MDS queue.” We analytically characterize the latency performance of MDS queues, for which we present insightful scheduling policies that form upper and lower bounds to performance, and show that they are quite tight. Extensive simulations using Monte Carlo methods are also provided and used to validate our theoretical analysis. As a side note, our lower-bound analytical method, based on the so-called MDS-Reservation(t) queue, represents an elegant practical scheme that requires the maintenance of considerably smaller state, depending on the parameter t, than that of the full-fledged MDS queue (which corresponds to t = ∞), and may be of independent interest in practical systems. Comparisons with replication-based systems reveal that codes provide better latency performance (by up to 70%) than replication. The second part of the paper considers an alternative method of (potentially) reducing latency in data centers, that of sending redundant requests. Here, a request is sent to more servers than required, and is deemed served when any requisite number of servers complete service.
Several recent works provide empirical evidence of the benefits of redundant requests in various settings, and in this paper, we aim to analytically characterize the situations in which redundant requests can actually help. We show that under the MDS queue model (with exponential service times and negligible costs of cancelling jobs), in a replication-based system, the average latency strictly reduces with more redundancy in the requests, and that under a general MDS code, the average latency is minimized when requests are sent to all servers. To the best of our knowledge, these are the first analytical results that prove the benefits of sending redundant requests.
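The redundant-request idea in the abstract can be illustrated with a small sketch. It is not the paper's analysis: it ignores queueing entirely and looks only at a single isolated request sent to r idle servers, of which any k must finish, with independent exponential service times of rate mu. Under those assumptions the expected completion time is the k-th order statistic of r exponentials, (1/mu) · Σ_{i=0}^{k-1} 1/(r − i). The function names and parameters below are illustrative choices, not taken from the paper.

```python
import random
from fractions import Fraction


def expected_service_time(r, k, mu=1):
    """Exact mean time until k of r i.i.d. Exp(mu) servers finish.

    Assumption: no queueing, all r servers start at time 0.
    """
    if not 1 <= k <= r:
        raise ValueError("need 1 <= k <= r")
    # The i-th completion (i = 0..k-1) waits on the minimum of (r - i)
    # exponentials, which has mean 1 / ((r - i) * mu).
    return sum(Fraction(1, r - i) for i in range(k)) / Fraction(mu)


def simulated_service_time(r, k, mu=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        times = sorted(rng.expovariate(mu) for _ in range(r))
        total += times[k - 1]  # time at which the k-th server finishes
    return total / trials


if __name__ == "__main__":
    k = 2
    for r in range(k, 7):
        exact = expected_service_time(r, k)
        approx = simulated_service_time(r, k)
        print(f"r={r}: exact={float(exact):.4f} simulated={approx:.4f}")
```

In this isolated setting the mean decreases as r grows for fixed k, which is consistent in spirit with the abstract's claim; the paper's result is stronger, since it accounts for the load that redundant requests add to the queue.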
Similar resources
Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy
Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources that enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...
Optimizing Fsync Performance with Dynamic Queue Depth Adaptation
Recent storage devices such as eMMC and SSD support command queueing [1] in order to improve storage I/O bandwidth. Command queueing allows multiple read/write requests to be pending. Since the multi-channel and multi-way architectures of eMMC and SSD can handle multiple requests simultaneously, command queueing is an indispensable technique for them. However, the command queueing ...
Development of a Model for Predicting Heart Attack Based on Fog Computing
Introduction: Various studies have demonstrated the benefits of using distributed fog computing for the Internet of Things (IoT). Fog computing has brought cloud computing capabilities such as computing, storage, and processing closer to IoT nodes. The new model of fog and edge computing, compared to cloud computing, provides less latency for data processing by bringing resources closer to user...
Queueing with redundant requests: exact analysis
Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to run a request on multiple servers and wait for the first completion (discarding all remaining copies of the request). However, there is no exact analysis of systems with redundancy. This paper presents the first exact analysis of systems with redundancy. We allow for any number of classes of...